Add autonomous agent system with Discord bot integration #5
Open
fluidnumericsJoe wants to merge 2 commits into main from
Convert all 9 .claude/agents/*.md definitions to standalone Python agents using the Claude Agent SDK, with a Discord bot replacing Slack for bidirectional simulation operations.

New spectre_agents package:
- 9 agent classes mirroring .md agents (orchestrator, workflow-runner, stdout-diagnostics, model-output-review, namelist-validator, forcing-data-qc, dashboard-manager, notify, web-research)
- 8 tool modules (bash, file_io, slurm, mitgcm, forcing, namelist, dashboard, discord_notify)
- Discord bot with slash commands (/run, /diagnose, /review, /validate, /qc, /dashboard, /ensemble, /config)
- Interactive decision views (buttons) for orchestrator halting triggers
- Systemd service for daemon deployment on Spectre cluster
- Complete setup documentation in docs/discord-setup.md

https://claude.ai/code/session_01WNamUYvvru6xmxpPLqqW4f
Add a knowledge handler that listens in #ask-mitgcm for natural language questions about MITgcm, ERA5, oceanography, and the codebase. It uses Claude (Sonnet) with a comprehensive system prompt derived from CLAUDE.md, plus WebSearch/WebFetch for live documentation lookups. It runs on the same bot instance as the simulation ops bot, so no second token is needed. Long answers auto-create threads to keep the channel clean.

https://claude.ai/code/session_01WNamUYvvru6xmxpPLqqW4f
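The "long answers auto-create threads" behaviour could be sketched roughly as below. This is only an illustration, not the code in this PR: `THREAD_THRESHOLD`, `split_answer`, and `plan_reply` are hypothetical names, and the cutoff value is an assumption. The one external fact relied on is Discord's 2000-character limit per message.

```python
DISCORD_MESSAGE_LIMIT = 2000  # Discord's per-message character cap
THREAD_THRESHOLD = 1500       # hypothetical cutoff; the PR does not state one


def split_answer(text, limit=DISCORD_MESSAGE_LIMIT):
    """Split text into chunks of at most `limit` characters.

    Prefers to cut at the last newline before the limit; falls back to a
    hard cut when no newline is available.
    """
    chunks = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks


def plan_reply(text):
    """Decide whether an answer goes inline or into a thread (assumed rule)."""
    mode = "thread" if len(text) > THREAD_THRESHOLD else "inline"
    return mode, split_answer(text)
```

A handler would then post the chunks either directly in the channel or into a newly created thread, depending on the returned mode.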
Summary
This PR introduces a complete autonomous agent system for MITgcm simulation orchestration on the Spectre (Franklin) cluster, with Discord bot integration for bidirectional user communication. The system uses Claude AI agents via the Claude Agent SDK to manage the full simulation lifecycle: configuration validation, job submission, failure diagnosis, and recovery.
Key Changes
Core Agent System
spectre_agents/agents/: Nine specialized agents with distinct responsibilities:
- orchestrator.py: Top-level lifecycle manager coordinating sub-agents
- workflow_runner.py: SLURM job submission and process management
- stdout_diagnostics.py: MITgcm STDOUT failure classification and diagnosis
- model_output_review.py: Physical plausibility assessment of simulation output
- namelist_validator.py: Pre-run configuration validation
- forcing_data_qc.py: EXF and OBC binary file validation
- dashboard_manager.py: Monitoring infrastructure lifecycle
- notify.py: Discord notification delivery
- web_research.py: Technical research capability
- base.py: Common agent infrastructure and tool registration

Tool Ecosystem
- tools/file_io.py: Read, write, edit, glob, grep operations
- tools/slurm.py: Job submission, status queries, cancellation
- tools/mitgcm.py: STDOUT parsing, monitor stats extraction, CFL analysis
- tools/forcing.py: EXF and OBC binary validation with physical range checks
- tools/namelist.py: Fortran namelist parsing and cross-validation
- tools/dashboard.py: Monitoring stack health checks and lifecycle
- tools/bash.py: Safe subprocess execution with denylist protection
- tools/discord_notify.py: Message posting, image uploads, interactive decisions

Discord Bot Integration
- discord_bot/bot.py: Discord client with command tree and decision queue processing
- discord_bot/commands.py: /run group (start, status, stop, resubmit), /diagnose, /validate, /dashboard commands
- discord_bot/embeds.py: Color-coded status, failure, health, and decision embeds
- discord_bot/views.py: Decision buttons for user approval flows

Configuration & Context
- config.py: YAML-based configuration with environment variable overrides, per-agent model selection
- context.py: Shared state between bot and agents, decision queue, simulation state persistence
- types.py: Enums and dataclasses for failures, health status, validation results

Entry Point & Documentation
- __main__.py: Async entry point orchestrating bot and agent runner
- docs/discord-setup.md: Complete walkthrough for Discord bot creation, server setup, secrets configuration, local testing, and systemd service installation
- systemd/spectre-agents.service: Service unit for daemon deployment

Notable Implementation Details
- ThreadPoolExecutor to keep the Discord bot responsive during long-running operations.
- spectre-agents-state.json for daemon restart resilience

https://claude.ai/code/session_01WNamUYvvru6xmxpPLqqW4f
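The two implementation notes above (offloading long-running work to a ThreadPoolExecutor, and persisting state to a JSON file) can be illustrated with a minimal sketch. It is not taken from this PR: `slow_agent_call`, `run_blocking`, `save_state`, `load_state`, `demo`, the worker count, and the state schema are all hypothetical. The heartbeat task stands in for the Discord event loop, which keeps ticking while the blocking call runs in a worker thread.

```python
import asyncio
import json
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical pool; the PR does not state the actual worker count.
_EXECUTOR = ThreadPoolExecutor(max_workers=2)


def slow_agent_call(name):
    """Stand-in for a blocking, long-running agent operation."""
    time.sleep(0.2)
    return {"agent": name, "status": "done"}


async def run_blocking(func, *args):
    """Run a blocking callable in the pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_EXECUTOR, func, *args)


def save_state(path, state):
    """Write state atomically: temp file in the same directory, then rename."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def load_state(path):
    """Return the saved state, or an empty dict if no file exists yet."""
    return json.loads(path.read_text()) if path.exists() else {}


async def demo(state_path):
    """Run one blocking call while a heartbeat proves the loop is responsive."""
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.02)

    beat = asyncio.create_task(heartbeat())
    result = await run_blocking(slow_agent_call, "orchestrator")
    beat.cancel()
    save_state(state_path, result)
    return load_state(state_path), ticks
```

Calling `asyncio.run(demo(Path("spectre-agents-state.json")))` returns the reloaded state plus the number of heartbeat ticks observed during the blocking call. Writing to a temp file and renaming means a crash mid-write leaves the previous state file intact.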